Airflow
Orchestration tool
Dynamic Dags
Possible since recent versions (source)
Task Groups
Ferramenta que agrupa dags dentro de blocos
t0 = DummyOperator(task_id='start')
# Start Task Group definition
with TaskGroup(group_id='group1') as tg1:
t1 = DummyOperator(task_id='task1')
t2 = DummyOperator(task_id='task2')
t1 >> t2
# End Task Group definition
t3 = DummyOperator(task_id='end')
# Set Task Group's (tg1) dependencies
t0 >> tg1 >> t3
Cross-dag dependencies
- TriggerDagRunOperator: ideal for downstream dependency
- ExternalTaskSensor: ideal for upstream dependency
- API
trigger_dependent_dag = TriggerDagRunOperator(
task_id="trigger_dependent_dag",
trigger_dag_id="dependent-dag",
wait_for_completion=True
)
Note: Since 2.1 we can see the dependencies set by option 1 and 2 in a separate view
Tips
- Avoid processing tasks on airflow. Delegate to other tools
Releases
2.1.0
- Add PythonVirtualenvDecorator to Taskflow API
- Add Taskgroup decorator
- Dag calendar view
- Cross-dag view
- Auto refresh tree view
- Allow celery workers without gossup or mingle modes
2.0.2
- Taskflow API
- REST API
- Scheduler improved (horizontal scaling)
- task Groups
- New UI
- Added Smart Sensors (replace external sensor?)
- Simplified Kubernetes Executor
- Split airflow packages into external
- Auto refresh view
- Faster webserver start up
1.10.15
- Fix bug on depends_on_past or task_concurrency stuck
- New new cli commands of 2.0
- Fix airflow db upgrade